1 Vector spaces and dimensionality

In quantum mechanics the state of a physical system is a vector in a complex vector space. Observables 
are linear operators, in fact, Hermitian operators acting on this complex vector space. The purpose 
of this chapter is to learn the basics of vector spaces, the structures that can be built on those spaces, 
and the operators that act on them. 

Complex vector spaces are somewhat different from the more familiar real vector spaces. I would
say they have more powerful properties. To understand complex vector spaces more generally,
it is useful to compare them often to their real counterparts. We will follow here the discussion
of the book Linear Algebra Done Right, by Sheldon Axler.

In a vector space one has vectors and numbers. We can add vectors to get vectors and we can 
multiply vectors by numbers to get vectors. If the numbers we use are real, we have a real vector space. 
If the numbers we use are complex, we have a complex vector space. More generally, the numbers
we use belong to what is called in mathematics a 'field', denoted by the letter $\mathbb{F}$. We will discuss
just two cases, $\mathbb{F} = \mathbb{R}$, meaning that the numbers are real, and $\mathbb{F} = \mathbb{C}$, meaning that the numbers are
complex.

The definition of a vector space is the same for $\mathbb{F}$ being $\mathbb{R}$ or $\mathbb{C}$. A vector space $V$ is a set of vectors
with an operation of addition (+) that assigns an element $u + v \in V$ to each $u, v \in V$. This means
that $V$ is closed under addition. There is also a scalar multiplication by elements of $\mathbb{F}$, with $av \in V$
for any $a \in \mathbb{F}$ and $v \in V$. This means the space $V$ is closed under multiplication by numbers. These
operations must satisfy the following additional properties:

1. $u + v = v + u \in V$ for all $u, v \in V$ (addition is commutative).

2. $u + (v + w) = (u + v) + w$ and $(ab)u = a(bu)$ for any $u, v, w \in V$ and $a, b \in \mathbb{F}$ (associativity).

3. There is a vector $0 \in V$ such that $0 + u = u$ for all $u \in V$ (additive identity).

4. For each $v \in V$ there is a $u \in V$ such that $v + u = 0$ (additive inverse).

5. The element $1 \in \mathbb{F}$ satisfies $1v = v$ for all $v \in V$ (multiplicative identity).

6. $a(u + v) = au + av$ and $(a + b)v = av + bv$ for every $u, v \in V$ and $a, b \in \mathbb{F}$ (distributive property).

This definition is very efficient. Several familiar properties follow from it by short proofs (which 
we will not give, but are not complicated and you may try to produce): 

• The additive identity is unique: any vector $0'$ that acts like $0$ is actually equal to $0$.

• $0v = 0$, for any $v \in V$, where the first zero is a number and the second one is a vector. This
means that the number zero acts as expected when multiplying a vector.

• $a0 = 0$, for any $a \in \mathbb{F}$. Here both zeroes are vectors. This means that the zero vector multiplied
by any number is still the zero vector.

• The additive inverse of any vector $v \in V$ is unique. It is denoted by $-v$ and in fact $-v = (-1)v$.

We must emphasize that while the numbers in $\mathbb{F}$ are sometimes real or complex, we never speak
of the vectors themselves as real or complex. A vector multiplied by a complex number is not said to 
be a complex vector, for example! The vectors in a real vector space are not themselves real, nor are 
the vectors in a complex vector space complex. We have the following examples of vector spaces: 


1. The set of $N$-component vectors

$$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{pmatrix}, \quad a_i \in \mathbb{R}, \quad i = 1, 2, \ldots, N, \qquad (1.1)$$

forms a real vector space.

2. The set of $M \times N$ matrices with complex entries

$$\begin{pmatrix} a_{11} & \cdots & a_{1N} \\ a_{21} & \cdots & a_{2N} \\ \vdots & & \vdots \\ a_{M1} & \cdots & a_{MN} \end{pmatrix}, \quad a_{ij} \in \mathbb{C}, \qquad (1.2)$$

is a complex vector space. Here multiplication by a constant multiplies each entry of the
matrix by the constant.


3. We can have matrices with complex entries that naturally form a real vector space. The space
of two-by-two hermitian matrices defines a real vector space. They do not form a complex vector
space since multiplication of a hermitian matrix by a complex number such as $i$ ruins the hermiticity.

4. The set $\mathcal{P}(\mathbb{F})$ of polynomials $p(z)$. Here the variable $z \in \mathbb{F}$ and $p(z) \in \mathbb{F}$. Each polynomial $p(z)$
has coefficients $a_0, a_1, \ldots, a_n$ also in $\mathbb{F}$:

$$p(z) = a_0 + a_1 z + a_2 z^2 + \ldots + a_n z^n . \qquad (1.3)$$

By definition, the integer $n$ is finite but it can take any nonnegative value. Addition of polynomials
works as expected and multiplication by a constant is also the obvious multiplication.
The space $\mathcal{P}(\mathbb{F})$ of all polynomials so defined forms a vector space over $\mathbb{F}$.


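5. The set $\mathbb{F}^\infty$ of infinite sequences $(x_1, x_2, \ldots)$ of elements $x_i \in \mathbb{F}$. Here

$$\begin{aligned} (x_1, x_2, \ldots) + (y_1, y_2, \ldots) &= (x_1 + y_1, x_2 + y_2, \ldots) \\ a(x_1, x_2, \ldots) &= (a x_1, a x_2, \ldots) , \quad a \in \mathbb{F} . \end{aligned} \qquad (1.4)$$

This is a vector space over $\mathbb{F}$.

6. The set of complex functions on an interval $x \in [0, L]$ forms a vector space over $\mathbb{C}$.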


To better understand a vector space one can try to figure out its possible subspaces. A subspace 
of a vector space V is a subset of V that is also a vector space. To verify that a subset U of V is a 
subspace you must check that U contains the vector 0, and that U is closed under addition and scalar 
multiplication. 

Sometimes a vector space $V$ can be described clearly in terms of a collection $U_1, U_2, \ldots, U_m$ of subspaces
of $V$. We say that the space $V$ is the direct sum of the subspaces $U_1, U_2, \ldots, U_m$ and we
write

$$V = U_1 \oplus U_2 \oplus \cdots \oplus U_m \qquad (1.5)$$

if any vector in $V$ can be written uniquely as the sum $u_1 + u_2 + \ldots + u_m$, where $u_i \in U_i$. To check
uniqueness one can, alternatively, verify that the only way to write $0$ as a sum $u_1 + u_2 + \ldots + u_m$ with
$u_i \in U_i$ is by taking all $u_i$'s equal to zero. For the case of two subspaces $V = U \oplus W$, it suffices to
prove that any vector can be written as $u + w$ with $u \in U$ and $w \in W$ and that $U \cap W = \{0\}$.

Given a vector space we can produce lists of vectors. A list $(v_1, v_2, \ldots, v_n)$ of vectors in $V$ contains,
by definition, a finite number of vectors. The number of vectors in the list is the length of the list.
The span of a list of vectors $(v_1, v_2, \ldots, v_n)$ in $V$, denoted as $\mathrm{span}(v_1, v_2, \ldots, v_n)$, is the set of all linear
combinations of these vectors

$$a_1 v_1 + a_2 v_2 + \ldots + a_n v_n , \quad a_i \in \mathbb{F} . \qquad (1.6)$$

A vector space $V$ is spanned by a list $(v_1, v_2, \ldots, v_n)$ if $V = \mathrm{span}(v_1, v_2, \ldots, v_n)$.

Now comes a very natural definition: A vector space $V$ is said to be finite dimensional if it is
spanned by some list of vectors in $V$. If $V$ is not finite dimensional, it is infinite dimensional. In
such a case, no list of vectors from $V$ can span $V$.

Let us show that the vector space of all polynomials p(z) considered in Example 4 is an infinite 
dimensional vector space. Indeed, consider any list of polynomials. In this list there is a polynomial 
of maximum degree (recall the list is finite). Thus polynomials of higher degree are not in the span of 
the list. Since no list can span the space, it is infinite dimensional. 

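For example 1, consider the list of vectors $(e_1, e_2, \ldots, e_N)$ with

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots \quad e_N = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} . \qquad (1.7)$$

This list spans the space (the vector displayed is $a_1 e_1 + a_2 e_2 + \ldots + a_N e_N$). This vector space is finite
dimensional.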

A list of vectors $(v_1, v_2, \ldots, v_n)$, with $v_i \in V$, is said to be linearly independent if the equation

$$a_1 v_1 + a_2 v_2 + \ldots + a_n v_n = 0 \qquad (1.8)$$

only has the solution $a_1 = a_2 = \cdots = a_n = 0$. One can show that the length of any linearly independent
list is shorter than or equal to the length of any spanning list. This is reasonable, because spanning lists
can be arbitrarily long (adding vectors to a spanning list still gives a spanning list), but a linearly
independent list cannot be enlarged beyond a certain point.

Finally, we get to the concept of a basis for a vector space. A basis of $V$ is a list of vectors
in $V$ that both spans $V$ and is linearly independent. Mathematicians easily prove that any finite
dimensional vector space has a basis. Moreover, all bases of a finite dimensional vector space have the
same length. The dimension of a finite-dimensional vector space is given by the length of any list of
basis vectors. One can also show that for a finite dimensional vector space a list of vectors of length
$\dim V$ is a basis if it is a linearly independent list or if it is a spanning list.

For example 1 we see that the list $(e_1, e_2, \ldots, e_N)$ in (1.7) is not only a spanning list but a linearly
independent list (prove it!). Thus the dimensionality of this space is $N$.

For example 3, recall that the most general hermitian two-by-two matrix takes the form

$$\begin{pmatrix} a_0 + a_3 & a_1 - i a_2 \\ a_1 + i a_2 & a_0 - a_3 \end{pmatrix}, \quad a_0, a_1, a_2, a_3 \in \mathbb{R} . \qquad (1.9)$$

Now consider the following list of four 'vectors' $(\mathbf{1}, \sigma_1, \sigma_2, \sigma_3)$, where $\mathbf{1}$ is the two-by-two identity
matrix and the $\sigma_i$ are the Pauli matrices. All entries in this list are hermitian
matrices, so this is a list of vectors in the space. Moreover they span the space since the most general
hermitian matrix, as shown above, is simply $a_0 \mathbf{1} + a_1 \sigma_1 + a_2 \sigma_2 + a_3 \sigma_3$. The list is linearly independent
as $a_0 \mathbf{1} + a_1 \sigma_1 + a_2 \sigma_2 + a_3 \sigma_3 = 0$ implies that

$$\begin{pmatrix} a_0 + a_3 & a_1 - i a_2 \\ a_1 + i a_2 & a_0 - a_3 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \qquad (1.10)$$

and you can quickly see that this implies $a_0, a_1, a_2$, and $a_3$ are zero. So the list is a basis and the space
in question is a four-dimensional real vector space.

Exercise. Explain why the vector space in example 2 has dimension $M \cdot N$.

It seems pretty obvious that the vector space in example 5 is infinite dimensional, but it actually
takes a bit of work to prove it.
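Returning to example 3, the basis property of the identity and the Pauli matrices can be checked on a concrete matrix. The following is only a minimal sketch in plain Python: the helper names (`is_hermitian`, `pauli_coefficients`, `combine`) and the sample matrix `h` are illustrative assumptions, not part of the notes. It reads off the four real coefficients from the form (1.9), rebuilds the matrix, and also shows that multiplying by $i$ destroys hermiticity.

```python
# Sketch: expand a 2x2 hermitian matrix in the list (1, sigma_1, sigma_2, sigma_3).
# Matrices are nested lists; all names here are hypothetical helpers.

IDENTITY = [[1, 0], [0, 1]]
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_2 = [[0, -1j], [1j, 0]]
SIGMA_3 = [[1, 0], [0, -1]]


def is_hermitian(m):
    """True if m equals its own conjugate transpose."""
    return all(m[i][j] == complex(m[j][i]).conjugate()
               for i in range(2) for j in range(2))


def pauli_coefficients(m):
    """Read the real numbers (a0, a1, a2, a3) off the general form (1.9)."""
    a0 = (m[0][0] + m[1][1]).real / 2
    a3 = (m[0][0] - m[1][1]).real / 2
    a1 = complex(m[1][0]).real  # the lower-left entry is a1 + i a2
    a2 = complex(m[1][0]).imag
    return a0, a1, a2, a3


def combine(a0, a1, a2, a3):
    """Build a0*1 + a1*sigma_1 + a2*sigma_2 + a3*sigma_3."""
    return [[a0 * IDENTITY[i][j] + a1 * SIGMA_1[i][j]
             + a2 * SIGMA_2[i][j] + a3 * SIGMA_3[i][j]
             for j in range(2)] for i in range(2)]


h = [[5, 2 - 3j], [2 + 3j, 1]]           # a sample hermitian matrix
coeffs = pauli_coefficients(h)           # four real numbers
rebuilt = combine(*coeffs)               # should reproduce h
times_i = [[1j * entry for entry in row] for row in h]  # i*h is not hermitian
```

The coefficients are real even though the matrix entries are complex, which is the sense in which this is a real vector space of dimension four.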


2 Linear operators and matrices 

A linear map refers in general to a certain kind of function from one vector space V to another vector 
space W. When the linear map takes the vector space V to itself, we call the linear map a linear 
operator. We will focus our attention on those operators. Let us then define a linear operator. 

A linear operator $T$ on a vector space $V$ is a function that takes $V$ to $V$ with the properties:

1. $T(u + v) = Tu + Tv$, for all $u, v \in V$.

2. $T(au) = aTu$, for all $a \in \mathbb{F}$ and $u \in V$.

We call $\mathcal{L}(V)$ the set of all linear operators that act on $V$. This can be a very interesting set, as we
will see below. Let us consider a few examples of linear operators.

1. Let $V$ denote the space of real polynomials $p(x)$ of a real variable $x$ with real coefficients. Here
are two linear operators:

• Let $T$ denote differentiation: $Tp = p'$. This operator is linear because $(p_1 + p_2)' = p_1' + p_2'$
and $(ap)' = ap'$.

• Let $S$ denote multiplication by $x$: $Sp = xp$. $S$ is also a linear operator.

2. In the space $\mathbb{F}^\infty$ of infinite sequences define the left-shift operator $L$ by

$$L(x_1, x_2, x_3, \ldots) = (x_2, x_3, \ldots) . \qquad (2.11)$$

We lose the first entry, but that is perfectly consistent with linearity. We also have the right-shift
operator $R$ that acts as follows:

$$R(x_1, x_2, \ldots) = (0, x_1, x_2, \ldots) . \qquad (2.12)$$

Note that the first entry in the result is zero. It could not be any other number because the zero
element (a sequence of all zeroes) should be mapped to itself (by linearity).

3. For any $V$, the zero map $0$ such that $0v = 0$. This map is linear and maps all elements of $V$ to
the zero element.

4. For any $V$, the identity map $I$ for which $Iv = v$ for all $v \in V$. This map leaves all vectors
invariant.


Since operators on $V$ can be added and can also be multiplied by numbers, the set $\mathcal{L}(V)$ introduced
above is itself a vector space (the vectors being the operators!). Indeed for any two operators
$T, S \in \mathcal{L}(V)$ we have the natural definition

$$\begin{aligned} (S + T)v &= Sv + Tv , \\ (aS)v &= a(Sv) . \end{aligned} \qquad (2.13)$$

The additive identity in the vector space $\mathcal{L}(V)$ is the zero map of example 3.

In this vector space there is a surprising new structure: the vectors (the operators!) can be
multiplied. There is a multiplication of linear operators that gives a linear operator. We just let one
operator act first and the second later. So given $S, T \in \mathcal{L}(V)$ we define the operator $ST$ as

$$(ST)v = S(Tv) . \qquad (2.14)$$

You should convince yourself that $ST$ is a linear operator. This product structure in the space of
linear operators is associative: $S(TU) = (ST)U$, for $S, T, U$ linear operators. Moreover it has an
identity element: the identity map of example 4. Most crucially this multiplication is, in general,
noncommutative. We can check this using the two operators $T$ and $S$ of example 1 acting on the
polynomial $p = x^n$. Since $T$ differentiates and $S$ multiplies by $x$ we get

$$(TS)x^n = T(Sx^n) = T(x^{n+1}) = (n + 1)x^n , \quad \text{while} \quad (ST)x^n = S(Tx^n) = S(n x^{n-1}) = n x^n . \qquad (2.15)$$

We can quantify this failure of commutativity by writing the difference

$$(TS - ST)x^n = (n + 1)x^n - n x^n = x^n = I x^n , \qquad (2.16)$$

where we inserted the identity operator at the last step. Since this relation is true for any $x^n$, it
also holds, by linearity, acting on any polynomial, namely on any element of the vector space. So we write

$$[T, S] = I , \qquad (2.17)$$

where we introduced the commutator $[\cdot, \cdot]$ of two operators $X, Y$, defined as $[X, Y] = XY - YX$.
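The relation $[T, S] = I$ can be tried out on an explicit polynomial. The sketch below is one possible way to do it in plain Python, under the assumption that a polynomial is stored as its list of coefficients `[c0, c1, c2, ...]`; the function names `T`, `S`, `subtract`, `trim` and `commutator_on` are chosen here for illustration only.

```python
# Sketch: polynomials as coefficient lists [c0, c1, c2, ...] = c0 + c1 x + c2 x^2 + ...

def T(p):
    """Differentiation: the coefficient of x^k becomes k*c_k and moves down one power."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def S(p):
    """Multiplication by x: every coefficient moves up one power."""
    return [0] + list(p)


def subtract(p, q):
    """Coefficient-wise difference p - q, padding the shorter list with zeroes."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def commutator_on(p):
    """Return (TS - ST) p."""
    return trim(subtract(T(S(p)), S(T(p))))


p = [7, 0, -2, 5]            # 7 - 2 x^2 + 5 x^3
result = commutator_on(p)    # equals p again, as [T, S] = I requires
```

Acting on a general polynomial rather than on a single monomial illustrates the linearity step used to pass from (2.16) to (2.17).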

The most basic features of an operator are captured by two simple concepts: its null space and its
range. Given some linear operator $T$ on $V$ it is of interest to consider those elements of $V$ that are
mapped to the zero element. The null space (or kernel) of $T \in \mathcal{L}(V)$ is the subset of vectors in $V$
that are mapped to zero by $T$:

$$\mathrm{null}\, T = \{ v \in V ;\ Tv = 0 \} . \qquad (2.18)$$

Actually null $T$ is a subspace of $V$. (The only nontrivial part of this proof is to show that $T(0) = 0$.
This follows from $T(0) = T(0 + 0) = T(0) + T(0)$ and then adding to both sides of this equation the
additive inverse of $T(0)$.)

A linear operator $T : V \to V$ is said to be injective if $Tu = Tv$, with $u, v \in V$, implies $u = v$. An
injective map is called a one-to-one map, because no two different elements can be mapped to the
same one. In fact, physicist Sean Carroll has suggested that a better name would be two-to-two, as
injectivity really means that two different elements are mapped by $T$ to two different elements! We
leave it for you as an exercise to prove the following important characterization of injective maps:

Exercise. Show that $T$ is injective if and only if null $T = \{0\}$.

Given a linear operator $T$ on $V$ it is also of interest to consider the elements of $V$ of the form $Tv$.
The linear operator may not produce by its action all of the elements of $V$. We define the range of
$T$ as the image of $V$ under the map $T$:

$$\mathrm{range}\, T = \{ Tv ;\ v \in V \} . \qquad (2.19)$$

Actually range $T$ is a subspace of $V$ (can you prove it?). The linear operator $T$ is said to be surjective
if range $T = V$. That is, if the image of $V$ under $T$ is the complete $V$.

Since both the null space and the range of a linear operator $T : V \to V$ are subspaces of $V$, one
can assign a dimension to them, and the following theorem is nontrivial:

$$\dim V = \dim(\mathrm{null}\, T) + \dim(\mathrm{range}\, T) . \qquad (2.20)$$
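To see the dimension formula at work in a simple finite-dimensional case, one can take (as an assumption made here for illustration, not an example from the text) the space $V$ of real polynomials of degree at most three, with $T$ the differentiation operator restricted to this space:

```latex
V = \mathrm{span}(1, x, x^2, x^3), \qquad \dim V = 4, \qquad Tp = p' .

\mathrm{null}\, T = \{ p \in V ;\ p' = 0 \} = \mathrm{span}(1), \qquad \dim(\mathrm{null}\, T) = 1 .

\mathrm{range}\, T = \{ p' ;\ p \in V \} = \mathrm{span}(1, x, x^2), \qquad \dim(\mathrm{range}\, T) = 3 .

\dim(\mathrm{null}\, T) + \dim(\mathrm{range}\, T) = 1 + 3 = 4 = \dim V .
```

Here $T$ is neither injective (constants are mapped to zero) nor surjective ($x^3$ is not the derivative of any element of $V$).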


Example. Describe the null space and range of the operator 



(2.21)


Let us now consider invertible linear operators. A linear operator $T \in \mathcal{L}(V)$ is invertible if there
exists another linear operator $S \in \mathcal{L}(V)$ such that $ST$ and $TS$ are identity maps (written as $I$). The
linear operator $S$ is called the inverse of $T$. The inverse is actually unique. Say $S$ and $S'$ are inverses
of $T$. Then we have

$$S = SI = S(TS') = (ST)S' = IS' = S' . \qquad (2.22)$$

Note that we required the inverse $S$ to be an inverse acting from the left and acting from the right.
This is useful for infinite dimensional vector spaces. For finite-dimensional vector spaces one suffices;
one can then show that $ST = I$ if and only if $TS = I$.

It is useful to have a good characterization of invertible linear operators. For a finite-dimensional
vector space $V$ the following three statements are equivalent!

$$\text{Finite dimension:} \quad T \text{ is invertible} \ \longleftrightarrow\ T \text{ is injective} \ \longleftrightarrow\ T \text{ is surjective} \qquad (2.23)$$

For infinite dimensional vector spaces injectivity and surjectivity are not equivalent (each can fail
independently). In that case invertibility is equivalent to injectivity plus surjectivity:

$$\text{Infinite dimension:} \quad T \text{ is invertible} \ \longleftrightarrow\ T \text{ is injective and surjective} \qquad (2.24)$$

The left shift operator $L$ is not injective (it maps $(x_1, 0, \ldots)$ to zero) but it is surjective. The right shift
operator is not surjective although it is injective.
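The behavior of the two shift operators can be explored with a small sketch. It assumes, purely for illustration, that an infinite sequence $(x_1, x_2, \ldots)$ is modeled as a Python function from the index $n = 1, 2, \ldots$ to the entry $x_n$; the names `left_shift`, `right_shift`, `prefix`, `squares` and `first_basis_sequence` are hypothetical. Only a finite prefix is ever inspected, so this is a check on examples, not a proof.

```python
# Sketch: an infinite sequence is a function n -> x_n, with n = 1, 2, 3, ...

def left_shift(x):
    """L(x1, x2, x3, ...) = (x2, x3, ...), as in (2.11)."""
    return lambda n: x(n + 1)


def right_shift(x):
    """R(x1, x2, ...) = (0, x1, x2, ...), as in (2.12)."""
    return lambda n: 0 if n == 1 else x(n - 1)


def prefix(x, length=5):
    """First few entries of the sequence, for inspection."""
    return [x(n) for n in range(1, length + 1)]


def squares(n):
    return n * n


def first_basis_sequence(n):
    """The sequence (1, 0, 0, ...)."""
    return 1 if n == 1 else 0


lr = prefix(left_shift(right_shift(squares)))    # LR acts as the identity
rl = prefix(right_shift(left_shift(squares)))    # RL loses the first entry
killed = prefix(left_shift(first_basis_sequence))  # a nonzero sequence sent to zero by L
```

On these examples $LR$ returns the original sequence while $RL$ does not, which illustrates why, in infinite dimensions, an inverse is required to work from both the left and the right.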


Now we consider the matrix associated to a linear operator $T$ that acts on a vector space $V$.
This matrix will depend on the basis we choose for $V$. Let us declare that our basis is the list
$(v_1, v_2, \ldots, v_n)$. It is clear that the full knowledge of the action of $T$ on $V$ is encoded in the action of
$T$ on the basis vectors, that is on the values $(Tv_1, Tv_2, \ldots, Tv_n)$. Since $Tv_j$ is in $V$, it can be written
as a linear combination of basis vectors. We then have

$$Tv_j = T_{1j} v_1 + T_{2j} v_2 + \ldots + T_{nj} v_n , \qquad (2.25)$$

where we introduced the constants $T_{ij}$ that are known if the operator $T$ is known. As we will see,
these are the entries of the matrix representation of the operator $T$ in the chosen basis. The above
relation can be written more briefly as

$$Tv_j = \sum_{i=1}^{n} T_{ij} v_i . \qquad (2.26)$$

When we deal with different bases it can be useful to use notation where we replace

$$T_{ij} \to T_{ij}(\{v\}) , \qquad (2.27)$$

so that it makes clear that $T$ is being represented using the $v$ basis $(v_1, \ldots, v_n)$.

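I want to make clear why (2.25) is reasonable before we show that it makes for a consistent
association between operator multiplication and matrix multiplication. The left-hand side, where we
have the action of the matrix for $T$ on the $j$-th basis vector, can be viewed concretely as

$$Tv_j \ \longleftrightarrow\ \begin{pmatrix} T_{11} & \cdots & T_{1j} & \cdots & T_{1n} \\ T_{21} & \cdots & T_{2j} & \cdots & T_{2n} \\ \vdots & & \vdots & & \vdots \\ T_{n1} & \cdots & T_{nj} & \cdots & T_{nn} \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} \leftarrow j\text{-th position} \qquad (2.28)$$

where the column vector has zeroes everywhere except on the $j$-th entry. The product, by the usual
rule of matrix multiplication, is the column vector

$$\begin{pmatrix} T_{1j} \\ T_{2j} \\ \vdots \\ T_{nj} \end{pmatrix} = T_{1j} \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + T_{2j} \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} + \cdots + T_{nj} \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} \ \longleftrightarrow\ T_{1j} v_1 + \ldots + T_{nj} v_n , \qquad (2.29)$$

which we identify with the right-hand side of (2.25). So (2.25) is reasonable.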

Exercise. Verify that the matrix representation of the identity operator is a diagonal matrix with an 
entry of one at each element of the diagonal. This is true for any basis. 

Let us now examine the product of two operators and their matrix representation. Consider the
operator $TS$ acting on $v_j$:

$$(TS)v_j = T(Sv_j) = T \sum_p S_{pj} v_p = \sum_p S_{pj} Tv_p = \sum_p S_{pj} \sum_i T_{ip} v_i \qquad (2.30)$$

so that changing the order of the sums we find

$$(TS)v_j = \sum_i \Big( \sum_p T_{ip} S_{pj} \Big) v_i . \qquad (2.31)$$

Using the identification implicit in (2.26) we see that the object in parenthesis is the $i, j$ matrix element
of the matrix that represents $TS$. Therefore we found

$$(TS)_{ij} = \sum_p T_{ip} S_{pj} , \qquad (2.32)$$

which is precisely the right formula for matrix multiplication. In other words, the matrix that represents
$TS$ is the product of the matrix that represents $T$ with the matrix that represents $S$, in that
order.
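The correspondence between composing operators and multiplying matrices can be checked on a small case. This is a sketch under stated assumptions: the operators `T` and `S` below are made-up linear maps on two-component vectors, the basis is the standard one, and `matrix_of` and `matmul` are hypothetical helper names. The matrix of an operator is built column by column from its action on the basis vectors, following (2.25).

```python
# Sketch: read off matrices from the action on basis vectors, then compare
# the matrix of the composition TS with the product of the two matrices.

def matrix_of(op, n):
    """Matrix whose j-th column holds the components of op(e_j), as in (2.25)."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]
        columns.append(op(e_j))
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """(ab)_ij = sum_p a_ip b_pj, the rule (2.32)."""
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def T(v):
    """A made-up linear operator on two-component vectors."""
    x, y = v
    return [x + 2 * y, 3 * y]


def S(v):
    """Another made-up linear operator."""
    x, y = v
    return [y, x - y]


def TS(v):
    return T(S(v))   # S acts first, then T


def ST(v):
    return S(T(v))   # T acts first, then S


matrix_TS = matrix_of(TS, 2)
product_TS = matmul(matrix_of(T, 2), matrix_of(S, 2))
matrix_ST = matrix_of(ST, 2)
product_ST = matmul(matrix_of(S, 2), matrix_of(T, 2))
```

The two orders give different matrices here, in line with the noncommutativity of operator multiplication.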

Changing basis 

While matrix representations are very useful for concrete visualization, they are basis dependent. 
It is a good idea to try to figure out if there are quantities that can be calculated using a matrix 
representation that are, nevertheless, guaranteed to be basis independent. One such quantity is the 
trace of the matrix representation of a linear operator. The trace is the sum of the matrix elements 
in the diagonal. Remarkably, that sum is the same independent of the basis used. Consider a linear
operator $T$ in $\mathcal{L}(V)$ and two sets of basis vectors $(v_1, \ldots, v_n)$ and $(u_1, \ldots, u_n)$ for $V$. Using the explicit
notation (2.27) for the matrix representation we state this property as

$$\mathrm{tr}\, T(\{v\}) = \mathrm{tr}\, T(\{u\}) . \qquad (2.33)$$

We will establish this result below. On the other hand, if this trace is actually basis independent, there 
should be a way to define the trace of the linear operator T without using its matrix representation. 
This is actually possible, as we will see. Another basis independent quantity is the determinant of the 
matrix representation of T. 
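Both claims can be tested numerically before proving them. The sketch below assumes that two matrix representations of the same operator are related by $T(\{u\}) = A^{-1} T(\{v\}) A$ with $A$ an invertible change-of-basis matrix, which is the relation (2.46) obtained in the derivation that follows. The matrices `T_v` and `A` are made-up two-by-two examples, the helper names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Sketch: trace and determinant before and after a change of basis (2x2 case).


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][p] * b[p][j] for p in range(2)) for j in range(2)]
            for i in range(2)]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def trace(m):
    return m[0][0] + m[1][1]


def inverse_2x2(a):
    """Inverse of an invertible 2x2 integer matrix, with exact rational entries."""
    d = det(a)
    return [[Fraction(a[1][1], d), Fraction(-a[0][1], d)],
            [Fraction(-a[1][0], d), Fraction(a[0][0], d)]]


T_v = [[2, 1], [0, 3]]   # representation in the v basis (made up)
A = [[1, 2], [3, 5]]     # change-of-basis matrix (made up, det = -1)

T_u = matmul(matmul(inverse_2x2(A), T_v), A)   # representation in the u basis
```

The individual entries of `T_u` differ from those of `T_v`, yet the trace and the determinant agree.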

Let us then consider the effect of a change of basis on the matrix representation of an operator.
Consider a vector space $V$ and a change of basis from $(v_1, \ldots, v_n)$ to $(u_1, \ldots, u_n)$ defined by the linear
operator $A$ as follows:

$$A : v_k \to u_k , \quad \text{for } k = 1, \ldots, n . \qquad (2.34)$$

This can also be written as

$$A v_k = u_k . \qquad (2.35)$$

Since we know how $A$ acts on every element of the basis we know, by linearity, how it acts on any
vector. The operator $A$ is clearly invertible because, letting $B : u_k \to v_k$ or

$$B u_k = v_k , \qquad (2.36)$$

we have

$$\begin{aligned} BA v_k &= B(A v_k) = B u_k = v_k , \\ AB u_k &= A(B u_k) = A v_k = u_k , \end{aligned} \qquad (2.37)$$

showing that $BA = I$ and $AB = I$. Thus $B$ is the inverse of $A$. Using the definition of matrix
representation, the right-hand sides of the relations $u_k = A v_k$ and $v_k = B u_k$ can be written so that
the equations take the form

$$u_k = A_{jk} v_j , \qquad v_k = B_{jk} u_j , \qquad (2.38)$$

where we used the convention that repeated indices are summed over. $A_{ij}$ are the elements of the
matrix representation of $A$ in the $v$ basis and $B_{ij}$ are the elements of the matrix representation of $B$
in the $u$ basis. Replacing the second relation in the first, and then replacing the first in the second,
we get

$$\begin{aligned} u_k &= A_{jk} B_{ij} u_i = B_{ij} A_{jk} u_i , \\ v_k &= B_{jk} A_{ij} v_i = A_{ij} B_{jk} v_i . \end{aligned} \qquad (2.39)$$

Since the $u$'s and $v$'s are basis vectors we must have

$$B_{ij} A_{jk} = \delta_{ik} \quad \text{and} \quad A_{ij} B_{jk} = \delta_{ik} , \qquad (2.40)$$

which means that the $B$ matrix is the inverse of the $A$ matrix. We have thus learned that

$$v_k = (A^{-1})_{jk} u_j . \qquad (2.41)$$


We can now apply these preparatory results to the matrix representations of the operator $T$. We
have, by definition,

$$T v_k = T_{ik}(\{v\})\, v_i . \qquad (2.42)$$

We now want to calculate $T$ on $u_k$ so that we can read the formula for the matrix $T$ on the $u$ basis:

$$T u_k = T_{ik}(\{u\})\, u_i . \qquad (2.43)$$

Computing the left-hand side, using the linearity of the operator $T$, we have

$$T u_k = T(A_{jk} v_j) = A_{jk} T v_j = A_{jk} T_{pj}(\{v\})\, v_p \qquad (2.44)$$

and using (2.41) we get

$$T u_k = A_{jk} T_{pj}(\{v\}) (A^{-1})_{ip} u_i = \big[ (A^{-1})_{ip} T_{pj}(\{v\}) A_{jk} \big] u_i = \big( A^{-1} T(\{v\}) A \big)_{ik} u_i . \qquad (2.45)$$

Comparing with (2.43) we get

$$T_{ik}(\{u\}) = \big( A^{-1} T(\{v\}) A \big)_{ik} \quad \to \quad T(\{u\}) = A^{-1} T(\{v\}) A . \qquad (2.46)$$

This is the result we wanted to obtain.

The trace of a matrix $T_{ij}$ is given by $T_{ii}$, where sum over $i$ is understood. To show that the trace
of $T$ is basis independent we write

$$\begin{aligned} \mathrm{tr}(T(\{u\})) = T_{ii}(\{u\}) &= (A^{-1})_{ij} T_{jk}(\{v\}) A_{ki} \\ &= A_{ki} (A^{-1})_{ij} T_{jk}(\{v\}) \\ &= \delta_{kj} T_{jk}(\{v\}) = T_{jj}(\{v\}) = \mathrm{tr}(T(\{v\})) . \end{aligned} \qquad (2.47)$$

For the determinant we recall that $\det(AB) = (\det A)(\det B)$. Therefore $\det(A) \det(A^{-1}) = 1$. From
(2.46) we then get

$$\det T(\{u\}) = \det(A^{-1}) \det T(\{v\}) \det A = \det T(\{v\}) . \qquad (2.48)$$

Thus the determinant of the matrix that represents a linear operator is independent of the basis used.

3 Eigenvalues and eigenvectors

In quantum mechanics we need to consider eigenvalues and eigenstates of hermitian operators acting on
complex vector spaces. These operators are called observables and their eigenvalues represent possible
results of a measurement. In order to acquire a better perspective on these matters, we consider the
eigenvalue/eigenvector problem in more generality.

One way to understand the action of an operator $T \in \mathcal{L}(V)$ on a vector space $V$ is to understand
how it acts on subspaces of $V$, as those are smaller than $V$ and thus possibly simpler to deal with. Let
$U$ denote a subspace of $V$. In general, the action of $T$ may take elements of $U$ outside $U$. We have a
noteworthy situation if $T$ acting on any element of $U$ gives an element of $U$. In this case $U$ is said to
be invariant under $T$, and $T$ is then a well-defined linear operator on $U$. A very interesting situation
arises if a suitable list of invariant subspaces gives the space $V$ as a direct sum.

Of all subspaces, one-dimensional ones are the simplest. Given some vector $u \in V$ one can consider
the one-dimensional subspace $U$ spanned by $u$:

$$U = \{ cu : c \in \mathbb{F} \} . \qquad (3.49)$$
We can ask if the one-dimensional subspace $U$ is left invariant by the operator $T$. For this $Tu$ must
be equal to a number times $u$, as this guarantees that $Tu \in U$. Calling the number $\lambda$, we write

$$Tu = \lambda u . \qquad (3.50)$$

This equation is so ubiquitous that names have been invented to label the objects involved. The
number $\lambda \in \mathbb{F}$ is called an eigenvalue of the linear operator $T$ if there is a nonzero vector $u \in V$
such that the equation above is satisfied. Suppose we find for some specific $\lambda$ a nonzero vector $u$
satisfying this equation. Then it follows that $cu$, for any $c \in \mathbb{F}$, also satisfies equation (3.50), so that
the solution space of the equation includes the subspace $U$, which is now said to be an invariant
subspace under $T$. It is convenient to call any vector that satisfies (3.50) for a given $\lambda$ an eigenvector
of $T$ corresponding to $\lambda$. In doing so we are including the zero vector as a solution and thus as an
eigenvector. It can often happen that for a given $\lambda$ there are several linearly independent eigenvectors.
In this case the invariant subspace associated with the eigenvalue $\lambda$ is higher dimensional. The set of
eigenvalues of $T$ is called the spectrum of $T$.

Our equation above is equivalent to

$$(T - \lambda I)u = 0 , \qquad (3.51)$$

for some nonzero $u$. It is therefore the case that

$$\lambda \text{ is an eigenvalue} \quad \longleftrightarrow \quad (T - \lambda I) \text{ not injective.} \qquad (3.52)$$

Using (2.23) we conclude that $\lambda$ being an eigenvalue also means that $(T - \lambda I)$ is not invertible, and not
surjective. We also note that

$$\text{Set of eigenvectors of } T \text{ corresponding to } \lambda = \mathrm{null}\,(T - \lambda I) . \qquad (3.53)$$

It should be emphasized that the eigenvalues of $T$ and the invariant subspaces (or eigenvectors associated
with fixed eigenvalues) are basis independent objects. Nowhere in our discussion did we have to
invoke the use of a basis, nor did we have to use any matrix representation. Below, we will discuss the
familiar calculation of eigenvalues and eigenvectors using a matrix representation of the operator $T$ in
some particular basis.

Let us consider some examples. Take a real three-dimensional vector space $V$ (our space to great
accuracy!). Consider the rotation operator $T$ that rotates all vectors by a fixed small angle about
the $z$ axis. To find eigenvalues and eigenvectors we just think of the invariant subspaces. We must
ask: which are the vectors for which this rotation doesn't change their direction and effectively just
multiplies them by a number? Only the vectors along the $z$-direction do not change direction upon
this rotation. So the vector space spanned by $e_z$ is the invariant subspace, or the space of eigenvectors.
The eigenvectors are associated with the eigenvalue of one, as the vectors are not altered at all by the
rotation.

Consider now the case where $T$ is a rotation by ninety degrees on a two-dimensional real vector
space $V$. Are there one-dimensional subspaces left invariant by $T$? No, all vectors are rotated, none
remains pointing in the same direction. Thus there are no eigenvalues, nor, of course, eigenvectors.
If you try calculating the eigenvalues by the usual recipe, you will find complex numbers. A complex
eigenvalue is meaningless in a real vector space.
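For concreteness, here is how the usual recipe plays out for the ninety-degree rotation. The matrix below assumes the standard basis of the plane and a counterclockwise rotation, which is a choice made here for illustration:

```latex
T = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
\det(T - \lambda \mathbf{1}) = \det \begin{pmatrix} -\lambda & -1 \\ 1 & -\lambda \end{pmatrix}
= \lambda^2 + 1 = 0
\quad \Longrightarrow \quad \lambda = \pm i .
```

Since $\lambda^2 + 1 > 0$ for every real $\lambda$, there is no real solution, in agreement with the geometric argument.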

Although we will not prove the following result, it follows from the facts we have introduced and
no extra machinery. It is of interest, being completely general and valid for both real and complex
vector spaces:

Theorem: Let $T \in \mathcal{L}(V)$ and assume $\lambda_1, \ldots, \lambda_n$ are distinct eigenvalues of $T$ and $u_1, \ldots, u_n$ are corresponding
nonzero eigenvectors. Then $(u_1, \ldots, u_n)$ are linearly independent.

Note that we cannot ask if the eigenvectors are orthogonal to each other as we have not yet
introduced an inner product on the vector space $V$. In this theorem there may be more than one
linearly independent eigenvector associated with some eigenvalues. In that case any one eigenvector
will do. Since an $n$-dimensional vector space $V$ does not have more than $n$ linearly independent
vectors, no linear operator on $V$ can have more than $n$ distinct eigenvalues.

We saw that some linear operators in real vector spaces can fail to have eigenvalues. Complex 
vector spaces are nicer. In fact, every linear operator on a finite-dimensional complex vector space has 
at least one eigenvalue. This is a fundamental result. It can be proven without using determinants 
with an elegant argument, but the proof using determinants is quite short. 

When $\lambda$ is an eigenvalue, we have seen that $T - \lambda I$ is not an invertible operator. This also
means that using any basis, the matrix representative of $T - \lambda I$ is non-invertible. The condition of
non-invertibility of a matrix is identical to the condition that its determinant vanish:

$$\det(T - \lambda \mathbf{1}) = 0 . \qquad (3.54)$$

This condition, in an $N$-dimensional vector space, looks like

$$\det \begin{pmatrix} T_{11} - \lambda & T_{12} & \cdots & T_{1N} \\ T_{21} & T_{22} - \lambda & \cdots & T_{2N} \\ \vdots & \vdots & & \vdots \\ T_{N1} & T_{N2} & \cdots & T_{NN} - \lambda \end{pmatrix} = 0 . \qquad (3.55)$$

The left-hand side is a polynomial $f(\lambda)$ in $\lambda$ of degree $N$ called the characteristic polynomial:

$$f(\lambda) = \det(T - \lambda \mathbf{1}) = (-\lambda)^N + b_{N-1} \lambda^{N-1} + \ldots + b_1 \lambda + b_0 , \qquad (3.56)$$

where the $b_i$ are constants. We are interested in the equation $f(\lambda) = 0$, as this determines all possible
eigenvalues. If we are working on real vector spaces, the constants $b_i$ are real but there is no guarantee
of real roots for $f(\lambda) = 0$. With complex vector spaces, the constants $b_i$ will be complex, but a complex
solution for $f(\lambda) = 0$ always exists. Indeed, over the complex numbers we can factor the polynomial
$f(\lambda)$ as follows

$$f(\lambda) = (-1)^N (\lambda - \lambda_1)(\lambda - \lambda_2) \ldots (\lambda - \lambda_N) , \qquad (3.57)$$

where the notation does not preclude the possibility that some of the $\lambda_i$'s may be equal. The $\lambda_i$'s
are the eigenvalues, since they lead to $f(\lambda) = 0$ for $\lambda = \lambda_i$. If all eigenvalues of $T$ are different
the spectrum of $T$ is said to be non-degenerate. If an eigenvalue appears $k$ times it is said to be a
degenerate eigenvalue of multiplicity $k$. Even in the most degenerate case we must have at least
one eigenvalue. The eigenvectors exist because $(T - \lambda I)$ non-invertible means it is not injective, and
therefore there are nonzero vectors that are mapped to zero by this operator.

4 Inner products 

We have been able to go a long way without introducing extra structure on the vector spaces. We 
have considered linear operators, matrix representations, traces, invariant subspaces, eigenvalues and 
eigenvectors. It is now time to put some additional structure on the vector spaces. In this section 
we consider a function called an inner product that allows us to construct numbers from vectors. A 
vector space equipped with an inner product is called an inner-product space. 

An inner product on a vector space V over F is a machine that takes an ordered pair of elements 
of V, that is, a first vector and a second vector, and yields a number in F. In order to motivate the 
definition of an inner product we first discuss the familiar way in which we associate a length to a 
vector. 

The length of a vector, or norm of a vector is a real number that is positive or zero, if the vector 
is the zero vector. In M n a vector a = (aq, ... a n ) has norm \a\ defined by 

\a\ = yjal + ... al (4.58) 

Squaring this one may think of \a\ 2 as the dot product of a with a: 

\a\ 2 = a ■ a = a 2 + ... a 2 x (4.59) 

Based on this the dot product of any two vectors a and b is defined by 

a-b = aib\ + ... + a n b n . (4.60) 

If we try to generalize this dot product we may require as needed properties the following: 

1. $a \cdot a \geq 0$, for all vectors $a$. 

2. $a \cdot a = 0$ if and only if $a = 0$. 

3. $a \cdot (b_1 + b_2) = a \cdot b_1 + a \cdot b_2$. Additivity in the second entry. 

4. $a \cdot (\alpha b) = \alpha\, a \cdot b$, with $\alpha$ a number. 

5. $a \cdot b = b \cdot a$. 
Along with these axioms, the length $|a|$ of a vector $a$ is the positive or zero number defined by the relation 

$|a|^2 = a \cdot a$ . (4.61) 

These axioms are satisfied by the definition (4.60) but do not require it. A new dot product defined 
by $a \cdot b = c_1 a_1 b_1 + \ldots + c_n a_n b_n$, with $c_1, \ldots, c_n$ positive constants, would do equally well! So whatever 
can be proven with these axioms holds true not only for the conventional dot product. 

The above axioms guarantee that the Schwarz inequality holds: 

$|a \cdot b| \leq |a|\,|b|$ . (4.62) 

To prove this consider two (nonzero) vectors $a$ and $b$ and then consider the shortest vector joining the 
tip of $a$ to the line defined by the direction of $b$ (see the figure below). This is the vector $a_\perp$, given by 

$a_\perp = a - \frac{a \cdot b}{b \cdot b}\, b$ . (4.63) 

The subscript $\perp$ is there because the vector is perpendicular to $b$, namely $a_\perp \cdot b = 0$, as you can quickly 
see. To write the above vector we subtracted from $a$ the component of $a$ parallel to $b$. Note that the 
vector $a_\perp$ is not changed as $b \to c\,b$; it does not depend on the overall length of $b$. Moreover, as it 
should, the vector $a_\perp$ is zero if and only if the vectors $a$ and $b$ are parallel. All this is only motivation; 
we could have just said “consider the following vector $a_\perp$”. 



Given axiom (1) we have that $a_\perp \cdot a_\perp \geq 0$ and therefore using (4.63) 

$a_\perp \cdot a_\perp = a \cdot a - \frac{(a \cdot b)^2}{b \cdot b} \geq 0$ . (4.64) 

Since $b$ is not the zero vector we then have 

$(a \cdot b)^2 \leq (a \cdot a)(b \cdot b)$ . (4.65) 

Taking the square root of this relation we obtain the Schwarz inequality (4.62). The inequality becomes 
an equality only if $a_\perp = 0$ or, as discussed above, when $a = cb$ with $c$ a real constant. 
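The argument above can be checked numerically. The following is a minimal sketch in Python, assuming vectors are plain lists of floats; the helper names `dot` and `perp`, the sample vectors, and the weights are illustrative choices and not part of the notes. It evaluates $a_\perp \cdot b$ and both sides of the Schwarz inequality, first for the conventional dot product and then for a weighted one with positive constants $c_i$.

```python
import math

def dot(a, b, c=None):
    """Dot product sum_i c_i a_i b_i; with c=None all weights are 1, as in (4.60)."""
    if c is None:
        c = [1.0] * len(a)
    return sum(ci * ai * bi for ci, ai, bi in zip(c, a, b))

def perp(a, b, c=None):
    """Part of a perpendicular to b, following (4.63)."""
    coeff = dot(a, b, c) / dot(b, b, c)
    return [ai - coeff * bi for ai, bi in zip(a, b)]

# Illustrative data (assumed, not from the notes)
a = [1.0, 2.0, 3.0]
b = [4.0, -1.0, 0.5]
weights = [2.0, 0.5, 3.0]  # arbitrary positive constants c_i

for c in (None, weights):
    a_perp = perp(a, b, c)
    lhs = abs(dot(a, b, c))
    rhs = math.sqrt(dot(a, a, c) * dot(b, b, c))
    print("a_perp . b =", dot(a_perp, b, c), " |a.b| =", lhs, " |a||b| =", rhs)
```

In both cases the first number should be zero up to rounding and the second should not exceed the third, which is what one expects since only the axioms were used in the proof.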


For complex vector spaces some modification is necessary. Recall that the length $|\gamma|$ of a complex 
number $\gamma$ is given by $|\gamma| = \sqrt{\gamma^* \gamma}$, where the asterisk superscript denotes complex conjugation. It is 
not hard to generalize this a bit. Let $z = (z_1, \ldots, z_n)$ be a vector in $\mathbb{C}^n$. Then the length $|z|$ of the vector 
is a real number greater than or equal to zero given by 

$|z| = \sqrt{z_1^* z_1 + \ldots + z_n^* z_n}$ . (4.66) 

We must use complex conjugates to produce a real number greater 
than or equal to zero. Squaring this we have 

$|z|^2 = z_1^* z_1 + \ldots + z_n^* z_n$ . (4.67) 

This suggests that for vectors $z = (z_1, \ldots, z_n)$ and $w = (w_1, \ldots, w_n)$ an inner product could be given 
by 

$w_1^* z_1 + \ldots + w_n^* z_n$ , (4.68) 

and we see that we are not treating the two vectors in an equivalent way. There is the first vector, 
in this case $w$, whose components are conjugated and a second vector $z$ whose components are not 
conjugated. If the order of vectors is reversed, we get for the inner product the complex conjugate of 
the original value. As was mentioned at the beginning of the section, the inner product requires an 
ordered pair of vectors. It certainly does for complex vector spaces. Moreover, one can define an inner 
product in general in a way that applies both to complex and real vector spaces. 

An inner product on a vector space $V$ over $\mathbb{F}$ is a map from an ordered pair $(u, v)$ of vectors 
in $V$ to a number $\langle u, v \rangle$ in $\mathbb{F}$. The axioms for $\langle u, v \rangle$ are inspired by the axioms we listed for the dot 
product. 

1. $\langle v, v \rangle \geq 0$, for all vectors $v \in V$. 

2. $\langle v, v \rangle = 0$ if and only if $v = 0$. 

3. $\langle u, v_1 + v_2 \rangle = \langle u, v_1 \rangle + \langle u, v_2 \rangle$. Additivity in the second entry. 

4. $\langle u, a v \rangle = a \langle u, v \rangle$, with $a \in \mathbb{F}$. Homogeneity in the second entry. 

5. $\langle u, v \rangle = \langle v, u \rangle^*$. Conjugate exchange symmetry. 

This time the norm $|v|$ of a vector $v \in V$ is the positive or zero number defined by the relation 

$|v|^2 = \langle v, v \rangle$ . (4.69) 

Compared with the axioms for the dot product, the only major difference is in number five, where we find that the inner 
product is not symmetric. We know what complex conjugation is in $\mathbb{C}$. For the above axioms to 
apply to vector spaces over $\mathbb{R}$ we just define the obvious: complex conjugation of a real number leaves 
it unchanged. In a real vector space the * conjugation does nothing and the inner product is strictly 
symmetric in its inputs. 
A few comments. One can use (3) with $v_2 = 0$ to show that $\langle u, 0 \rangle = 0$ for all $u \in V$, and thus, by 
(5) also $\langle 0, u \rangle = 0$. Properties (3) and (4) amount to full linearity in the second entry. It is important 
to note that additivity holds for the first entry as well: 

$\langle u_1 + u_2, v \rangle = \langle v, u_1 + u_2 \rangle^*$ 

$= (\langle v, u_1 \rangle + \langle v, u_2 \rangle)^*$ 

$= \langle v, u_1 \rangle^* + \langle v, u_2 \rangle^*$ (4.70) 

$= \langle u_1, v \rangle + \langle u_2, v \rangle$ . 

Homogeneity works differently on the first entry, however, 

$\langle a u, v \rangle = \langle v, a u \rangle^*$ 

$= (a \langle v, u \rangle)^*$ (4.71) 

$= a^* \langle u, v \rangle$ . 

Thus we get conjugate homogeneity on the first entry. This is a very important fact. Of course, 
for a real vector space conjugate homogeneity is the same as just plain homogeneity. 
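These properties are easy to confirm on $\mathbb{C}^n$ with the inner product suggested by (4.68). The sketch below is only illustrative: it assumes vectors are Python lists of complex numbers, and the helper names `inner` and `scale` and the sample data are hypothetical.

```python
def inner(u, v):
    """Standard inner product on C^n: the first entry is conjugated, as in (4.68)."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def scale(a, v):
    """Multiply every component of v by the number a."""
    return [a * vi for vi in v]

# Illustrative data (assumed, not from the notes)
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
a = 2 + 3j

print(inner(u, v), inner(v, u).conjugate())                 # axiom 5
print(inner(u, scale(a, v)), a * inner(u, v))               # axiom 4
print(inner(scale(a, u), v), a.conjugate() * inner(u, v))   # eq. (4.71)
```

Each printed pair should agree; in particular the factor that comes out of the first entry is $a^*$, not $a$.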

Two vectors $u, v \in V$ are said to be orthogonal if $\langle u, v \rangle = 0$. This, of course, means that $\langle v, u \rangle = 0$ 
as well. The zero vector is orthogonal to all vectors (including itself). Any vector orthogonal to all 
vectors in the vector space must be equal to zero. Indeed, if $x \in V$ is such that $\langle x, v \rangle = 0$ for all $v$, 
pick $v = x$, so that $\langle x, x \rangle = 0$ implies $x = 0$ by axiom 2. This property is sometimes stated as the 
non-degeneracy of the inner product. The “Pythagorean” identity holds for the norm-squared of 
orthogonal vectors in an inner-product vector space. As you can quickly verify, 

$|u + v|^2 = |u|^2 + |v|^2$ , for $u, v \in V$ orthogonal vectors. (4.72) 


The Schwarz inequality can be proven by an argument fairly analogous to the one we gave above 
for dot products. The result now reads 

Schwarz Inequality: $|\langle u, v \rangle| \leq |u|\,|v|$ . (4.73) 

The inequality is saturated if and only if one vector is a multiple of the other. Note that on the 
left-hand side |...| denotes the norm of a complex number and on the right-hand side each |...| denotes 
the norm of a vector. You will prove this inequality in a slightly different way in the homework. You 
will also consider there the triangle inequality 

$|u + v| \leq |u| + |v|$ , (4.74) 

which is saturated when $u = cv$ for $c$ a real, positive constant. Our definition (4.69) of norm on a 
vector space $V$ is mathematically sound: a norm is required to satisfy the triangle inequality. Other 
properties are required: (i) $|v| \geq 0$ for all $v$, (ii) $|v| = 0$ if and only if $v = 0$, and (iii) $|cv| = |c|\,|v|$ for $c$ 
some constant. Our norm satisfies all of them. 
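As a quick numerical illustration of (4.73) and (4.74), here is a small Python sketch. It assumes the standard inner product on $\mathbb{C}^3$; the helper names `inner` and `norm` and the sample vectors are illustrative, not taken from the notes.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def norm(v):
    """Norm defined by |v|^2 = <v, v>, eq. (4.69)."""
    return math.sqrt(inner(v, v).real)

# Illustrative data (assumed, not from the notes)
u = [1 + 2j, 3 - 1j, 0.5j]
v = [2 - 1j, 1j, -1 + 1j]
w = [ui + vi for ui, vi in zip(u, v)]

schwarz_gap = norm(u) * norm(v) - abs(inner(u, v))   # should be >= 0
triangle_gap = norm(u) + norm(v) - norm(w)           # should be >= 0

# Saturation of the triangle inequality: v2 = c u with c real and positive
c = 2.5
v2 = [c * ui for ui in u]
w2 = [ui + vi for ui, vi in zip(u, v2)]
saturated_gap = norm(u) + norm(v2) - norm(w2)        # should vanish up to rounding

print(schwarz_gap, triangle_gap, saturated_gap)
```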
A complex vector space with an inner product as we have defined is a Hilbert space if it is finite 
dimensional. If the vector space is infinite dimensional, an extra completeness requirement must be 
satisfied for the space to be a Hilbert space: all Cauchy sequences of vectors must converge to vectors 
in the space. An infinite sequence of vectors $v_i$, with $i = 1, 2, \ldots, \infty$, is a Cauchy sequence if for any 
$\epsilon > 0$ there is an $N$ such that $|v_n - v_m| < \epsilon$ whenever $n, m > N$. 

5 Orthonormal basis and orthogonal projectors 

In an inner-product space we can demand that basis vectors have special properties. A list of vectors 
is said to be orthonormal if all vectors have norm one and are pairwise orthogonal. Consider a list 
$(e_1, \ldots, e_n)$ of orthonormal vectors in $V$. Orthonormality means that 

$\langle e_i, e_j \rangle = \delta_{ij}$ . (5.75) 

We also have a simple expression for the norm-squared of $a_1 e_1 + \ldots + a_n e_n$, with $a_i \in \mathbb{F}$: 

$|a_1 e_1 + \ldots + a_n e_n|^2 = \langle a_1 e_1 + \ldots + a_n e_n ,\, a_1 e_1 + \ldots + a_n e_n \rangle$ 

$= \langle a_1 e_1, a_1 e_1 \rangle + \ldots + \langle a_n e_n, a_n e_n \rangle$ (5.76) 

$= |a_1|^2 + \ldots + |a_n|^2$ . 

This result implies the somewhat nontrivial fact that the vectors in any orthonormal list are linearly 
independent. Indeed if $a_1 e_1 + \ldots + a_n e_n = 0$ then its norm is zero and so is $|a_1|^2 + \ldots + |a_n|^2$. This 
implies all $a_i = 0$, thus proving the claim. 

An orthonormal basis of $V$ is a list of orthonormal vectors that is also a basis for $V$. Let 
$(e_1, \ldots, e_n)$ denote an orthonormal basis. Then any vector $v$ can be written as 

$v = a_1 e_1 + \ldots + a_n e_n$ , (5.77) 

for some constants $a_i$ that can be calculated as follows 

$\langle e_i, v \rangle = \langle e_i, a_i e_i \rangle = a_i$ , ($i$ not summed). (5.78) 

Therefore any vector $v$ can be written as 

$v = \langle e_1, v \rangle e_1 + \ldots + \langle e_n, v \rangle e_n = \langle e_i, v \rangle e_i$ . (5.79) 
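The expansion (5.79) can be tried out on a small example. The sketch below assumes the standard inner product on $\mathbb{C}^2$ and uses an orthonormal basis chosen purely for illustration; the names `inner`, `basis`, `coeffs` and `rebuilt` are hypothetical.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

s = 1 / math.sqrt(2)
# An orthonormal basis of C^2 (assumed example): e_1 = (1, i)/sqrt(2), e_2 = (1, -i)/sqrt(2)
basis = [[s + 0j, s * 1j], [s + 0j, -s * 1j]]

v = [3 - 1j, 2j]
coeffs = [inner(e, v) for e in basis]   # a_i = <e_i, v>, eq. (5.78)
# Rebuild v from its components, eq. (5.79)
rebuilt = [sum(a * e[k] for a, e in zip(coeffs, basis)) for k in range(2)]

print(rebuilt)
print(sum(abs(a) ** 2 for a in coeffs), inner(v, v).real)   # |a_1|^2 + |a_2|^2 versus |v|^2
```

The rebuilt vector should reproduce `v`, and the sum of the $|a_i|^2$ should equal $|v|^2$, in line with (5.76).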

To find an orthonormal basis on an inner product space $V$ we just need to start with a basis and 
then use an algorithm to turn it into an orthonormal basis. In fact, a little more generally: 

Gram-Schmidt: Given a list $(v_1, \ldots, v_n)$ of linearly independent vectors in $V$ one can construct a 
list $(e_1, \ldots, e_n)$ of orthonormal vectors such that both lists span the same subspace of $V$. 

The Gram-Schmidt algorithm goes as follows. You take $e_1$ to be $v_1$, normalized to have unit norm: 
$e_1 = v_1/|v_1|$. Then take $v_2 + \alpha e_1$ and fix the constant $\alpha$ so that this vector is orthogonal to $e_1$. The 
answer is clearly $v_2 - \langle e_1, v_2 \rangle e_1$. This vector, normalized by dividing it by its norm, is set equal to $e_2$. 
In fact we can write the general vector in a recursive fashion. If we know $e_1, e_2, \ldots, e_{j-1}$, we can write 
$e_j$ as follows: 

$e_j = \frac{v_j - \langle e_1, v_j \rangle e_1 - \ldots - \langle e_{j-1}, v_j \rangle e_{j-1}}{|v_j - \langle e_1, v_j \rangle e_1 - \ldots - \langle e_{j-1}, v_j \rangle e_{j-1}|}$ . (5.80) 

It should be clear to you by inspection that this vector is orthogonal to the vectors $e_i$ with $i < j$ and 
has unit norm. The Gram-Schmidt procedure is quite practical. 
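The recursion (5.80) translates almost line by line into code. The following is a minimal sketch, assuming the standard inner product on $\mathbb{C}^n$ and vectors stored as Python lists; the function name `gram_schmidt`, the tolerance `1e-12`, and the input list are illustrative assumptions.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list following (5.80)."""
    es = []
    for v in vectors:
        w = list(v)
        for e in es:
            c = inner(e, v)                          # <e_i, v_j>
            w = [wk - c * ek for wk, ek in zip(w, e)]
        n = math.sqrt(inner(w, w).real)
        if n < 1e-12:                                # assumed tolerance
            raise ValueError("vectors are not linearly independent")
        es.append([wk / n for wk in w])
    return es

# Illustrative linearly independent vectors in C^3 (assumed, not from the notes)
vs = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1 + 0j, 1 + 0j]]
es = gram_schmidt(vs)
gram = [[inner(ei, ej) for ej in es] for ei in es]   # should be the identity matrix
print(gram)
```

Here the coefficients are computed against the original $v_j$, exactly as in (5.80). In floating-point work the modified variant, which projects the running remainder instead, is often preferred for numerical stability.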


With an inner product we can construct interesting subspaces of a vector space $V$. Consider a 
subset $U$ of vectors in $V$ (not necessarily a subspace). Then we can define a subspace $U^\perp$, called the 
orthogonal complement of $U$, as the set of all vectors orthogonal to the vectors in $U$: 

$U^\perp = \{ v \in V \,|\, \langle v, u \rangle = 0, \text{ for all } u \in U \}$ . (5.81) 

This is clearly a subspace of $V$. When $U$ is a subspace, then $U$ and $U^\perp$ actually give a direct sum 
decomposition of the full space: 

Theorem: If $U$ is a subspace of $V$, then $V = U \oplus U^\perp$. 

Proof: This is a fundamental result and is not hard to prove. Let $(e_1, \ldots, e_n)$ be an orthonormal basis 
for $U$. We can clearly write any vector $v$ in $V$ as 

$v = (\langle e_1, v \rangle e_1 + \ldots + \langle e_n, v \rangle e_n) + (v - \langle e_1, v \rangle e_1 - \ldots - \langle e_n, v \rangle e_n)$ . (5.82) 

On the right-hand side the first vector in parentheses is clearly in $U$ as it is written as a linear 
combination of $U$ basis vectors. The second vector is clearly in $U^\perp$, as one can see that it is orthogonal 
to any vector in $U$. To complete the proof one must show that there is no vector except the zero 
vector in the intersection $U \cap U^\perp$ (recall the comments below (1.5)). Let $v \in U \cap U^\perp$. Then $v$ is in $U$ 
and in $U^\perp$ so it should satisfy $\langle v, v \rangle = 0$. But then $v = 0$, completing the proof. 

Given this decomposition any vector $v \in V$ can be written uniquely as $v = u + w$ where $u \in U$ 
and $w \in U^\perp$. One can define a linear operator $P_U$, called the orthogonal projection of $V$ onto $U$, 
that acting on $v$ above gives the vector $u$. It is clear from this definition that: (i) the range 
of $P_U$ is $U$, (ii) the null space of $P_U$ is $U^\perp$, (iii) $P_U$ is not invertible (unless $U = V$) and, (iv) acting on $U$, the 
operator $P_U$ is the identity operator. The formula for the vector $u$ can be read from (5.82): 

$P_U v = \langle e_1, v \rangle e_1 + \ldots + \langle e_n, v \rangle e_n$ . (5.83) 


It is a straightforward but good exercise to verify that this formula is consistent with the fact that, 
acting on $U$, the operator $P_U$ is the identity operator. Thus if we act twice in succession with $P_U$ on 
a vector, the second action has no effect as it is already acting on a vector in $U$. It follows from this 
that 

$P_U P_U = I P_U = P_U \quad \to \quad P_U^2 = P_U$ . (5.84) 


The eigenvalues and eigenvectors of $P_U$ are easy to describe. Since all vectors in $U$ are left invariant by 
the action of $P_U$, an orthonormal basis of $U$ provides a set of orthonormal eigenvectors of $P_U$ all with 
eigenvalue one. If we choose on $U^\perp$ an orthonormal basis, that basis provides orthonormal eigenvectors 
of $P_U$ all with eigenvalue zero. 

In fact equation (5.84) implies that the eigenvalues of $P_U$ can only be one or zero. The eigenvalues 
of an operator satisfy whatever equation the operator satisfies (as shown by letting the equation act 
on a presumed eigenvector), thus $\lambda^2 = \lambda$ is needed, and this gives $\lambda(\lambda - 1) = 0$, and $\lambda = 0, 1$, as the 
only possibilities. 

Consider a vector space $V = U \oplus U^\perp$ that is $(n + k)$-dimensional, where $U$ is $n$-dimensional and 
$U^\perp$ is $k$-dimensional. Let $(e_1, \ldots, e_n)$ be an orthonormal basis for $U$ and $(f_1, \ldots, f_k)$ an orthonormal 
basis for $U^\perp$. We then see that the list of vectors $(g_1, \ldots, g_{n+k})$ defined by 

$(g_1, \ldots, g_{n+k}) = (e_1, \ldots, e_n, f_1, \ldots, f_k)$ is an orthonormal basis for $V$. (5.85) 

Exercise: Use $P_U e_i = e_i$, for $i = 1, \ldots, n$ and $P_U f_i = 0$, for $i = 1, \ldots, k$, to show that in the above basis 
the projector operator is represented by the diagonal matrix: 

$P_U = \mathrm{diag}(\underbrace{1, \ldots, 1}_{n \text{ entries}}, \underbrace{0, \ldots, 0}_{k \text{ entries}})$ . (5.86) 

We see that, as expected from its non-invertibility, $\det(P_U) = 0$. But more interestingly we see that 
the trace of the matrix $P_U$ is $n$. Therefore 

$\mathrm{tr}\, P_U = \dim U$ . (5.87) 

The dimension of $U$ is the rank of the projector $P_U$. Rank one projectors are the most common 
projectors. They project to one-dimensional subspaces of the vector space. 
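The properties (5.84) and (5.87) can be seen concretely by building the matrix of $P_U$ from (5.83). The sketch below assumes $V = \mathbb{C}^3$ with the standard inner product and a two-dimensional $U$ chosen for illustration; the helper names `projector` and `matmul` are hypothetical.

```python
import math

def projector(basis_U, dim):
    """Matrix of P_U in the standard basis: P[r][c] = sum_i e_i[r] * conj(e_i[c]), from (5.83)."""
    return [[sum(e[r] * e[c].conjugate() for e in basis_U) for c in range(dim)]
            for r in range(dim)]

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

s = 1 / math.sqrt(2)
# Orthonormal basis of a 2-dimensional subspace U of C^3 (assumed example)
basis_U = [[s + 0j, s * 1j, 0j], [0j, 0j, 1 + 0j]]

P = projector(basis_U, 3)
P2 = matmul(P, P)
trace = sum(P[k][k] for k in range(3))

print(P2 == P or "P^2 equals P up to rounding")
print(trace)   # expected to be dim U = 2 up to rounding
```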

Projection operators are useful in quantum mechanics, where observables are described by operators. 
The effect of measuring an observable on a physical state vector is to turn this original vector 
instantaneously into another vector. This resulting vector is the orthogonal projection of the original 
vector down to some eigenspace of the operator associated with the observable. 

6 Linear functionals and adjoint operators 

When we consider a linear operator $T$ on a vector space $V$ that has an inner product, we can construct 
a related linear operator $T^\dagger$ on $V$ called the adjoint of $T$. This is a very useful operator and is typically 
different from $T$. When the adjoint $T^\dagger$ happens to be equal to $T$, the operator is said to be Hermitian. 
To understand adjoints, we first need to develop the concept of a linear functional. 

A linear functional $\phi$ on the vector space $V$ is a linear map from $V$ to the numbers $\mathbb{F}$: for $v \in V$, 
$\phi(v) \in \mathbb{F}$. A linear functional has the following two properties: 

1. $\phi(v_1 + v_2) = \phi(v_1) + \phi(v_2)$, with $v_1, v_2 \in V$. 

2. $\phi(av) = a\phi(v)$ for $v \in V$ and $a \in \mathbb{F}$. 
As an example, consider the three-dimensional real vector space $\mathbb{R}^3$ with inner product equal to the 
familiar dot product. Writing a vector $v$ as the triplet $v = (v_1, v_2, v_3)$, we take 

$\phi(v) = 3v_1 + 2v_2 - 4v_3$ . (6.1) 

Linearity is clear as the right-hand side features the components $v_1, v_2, v_3$ appearing linearly. We can 
use a vector $u = (3, 2, -4)$ to write the linear functional as an inner product. Indeed, one can readily 
see that 

$\phi(v) = \langle u, v \rangle$ . (6.2) 

This is no accident, in fact. We can prove that any linear functional $\phi(v)$ admits such a representation 
with some suitable choice of vector $u$. 

Theorem: Let $\phi$ be a linear functional on $V$. There is a unique vector $u \in V$ such that $\phi(v) = \langle u, v \rangle$ 
for all $v \in V$. 

Proof: Consider an orthonormal basis $(e_1, \ldots, e_n)$ and write the vector $v$ as 

$v = \langle e_1, v \rangle e_1 + \ldots + \langle e_n, v \rangle e_n$ . (6.3) 

When $\phi$ acts on $v$ we find, first by linearity and then by conjugate homogeneity 

$\phi(v) = \phi(\langle e_1, v \rangle e_1 + \ldots + \langle e_n, v \rangle e_n)$ 

$= \langle e_1, v \rangle \phi(e_1) + \ldots + \langle e_n, v \rangle \phi(e_n)$ 

$= \langle \phi(e_1)^* e_1, v \rangle + \ldots + \langle \phi(e_n)^* e_n, v \rangle$ (6.4) 

$= \langle \phi(e_1)^* e_1 + \ldots + \phi(e_n)^* e_n, v \rangle$ . 

We have thus shown that, as claimed 

$\phi(v) = \langle u, v \rangle$ with $u = \phi(e_1)^* e_1 + \ldots + \phi(e_n)^* e_n$ . (6.5) 

Next, we prove that this $u$ is unique. If there exists another vector, $u'$, that also gives the correct 
result for all $v$, then $\langle u', v \rangle = \langle u, v \rangle$, which implies $\langle u - u', v \rangle = 0$ for all $v$. Taking $v = u' - u$, we see 
that this shows $u' - u = 0$ or $u' = u$, proving uniqueness.¹ 

We can modify the notation a bit when needed, to write 

$\phi_u(v) = \langle u, v \rangle$ , (6.6) 

where the left-hand side makes it clear that this is a functional acting on $v$ that depends on $u$. 
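The construction (6.5) of the representing vector is short enough to try directly. The sketch below assumes $V = \mathbb{C}^3$ with the standard inner product and the standard basis; the functional `phi` is a hypothetical example with complex coefficients, chosen so that the conjugation in (6.5) is visible.

```python
def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def phi(v):
    """A hypothetical linear functional on C^3, chosen for illustration."""
    return (1 + 2j) * v[0] - 3j * v[1] + 4 * v[2]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # standard orthonormal basis

# u = phi(e_1)^* e_1 + ... + phi(e_n)^* e_n, eq. (6.5)
u = [sum(phi(e).conjugate() * e[k] for e in basis) for k in range(3)]

v = [2 - 1j, 1 + 1j, -3j]                   # arbitrary test vector
print(u)
print(phi(v), inner(u, v))                  # the two values should agree
```

Note that the components of `u` are the complex conjugates of the coefficients of `phi`, which is what makes $\langle u, v \rangle$ reproduce $\phi(v)$.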

We can now address the construction of the adjoint. Consider $\phi(v) = \langle u, Tv \rangle$, which is clearly 
a linear functional, whatever the operator $T$ is. Since any linear functional can be written as $\langle w, v \rangle$, 
with some suitable vector $w$, we write 

$\langle u, Tv \rangle = \langle w, v \rangle$ . (6.7) 

¹ This theorem holds for infinite dimensional Hilbert spaces, for continuous linear functionals. 
Of course, the vector $w$ must depend on the vector $u$ that appears on the left-hand side. Moreover, 
it must have something to do with the operator $T$, which does not appear anymore on the right-hand 
side. So we must look for some good notation here. We can think of $w$ as a function of the vector $u$ 
and thus write $w = T^\dagger u$ where $T^\dagger$ denotes a map (not obviously linear) from $V$ to $V$. So, we think of 
$T^\dagger u$ as the vector obtained by acting with some function $T^\dagger$ on $u$. The above equation is written as 

$\langle u, Tv \rangle = \langle T^\dagger u, v \rangle$ . (6.8) 

Our next step is to show that, in fact, $T^\dagger$ is a linear operator on $V$. The operator $T^\dagger$ is called the 
adjoint of $T$. Consider 

$\langle u_1 + u_2, Tv \rangle = \langle T^\dagger(u_1 + u_2), v \rangle$ , (6.9) 

and work on the left-hand side to get 

$\langle u_1 + u_2, Tv \rangle = \langle u_1, Tv \rangle + \langle u_2, Tv \rangle$ 

$= \langle T^\dagger u_1, v \rangle + \langle T^\dagger u_2, v \rangle$ (6.10) 

$= \langle T^\dagger u_1 + T^\dagger u_2, v \rangle$ . 

Comparing the right-hand sides of the last two equations we get the desired: 

$T^\dagger(u_1 + u_2) = T^\dagger u_1 + T^\dagger u_2$ . (6.11) 

Having established additivity, we now establish homogeneity. Consider 

$\langle au, Tv \rangle = \langle T^\dagger(au), v \rangle$ . (6.12) 

The left-hand side is 

$\langle au, Tv \rangle = a^* \langle u, Tv \rangle = a^* \langle T^\dagger u, v \rangle = \langle a T^\dagger u, v \rangle$ . (6.13) 

This time we conclude that 

$T^\dagger(au) = a T^\dagger u$ . (6.14) 

This concludes the proof that $T^\dagger$, so defined, is a linear operator on $V$. 

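A couple of important properties are readily proven: 

Claim: $(ST)^\dagger = T^\dagger S^\dagger$. We can show this as follows: $\langle u, STv \rangle = \langle S^\dagger u, Tv \rangle = \langle T^\dagger S^\dagger u, v \rangle$. 

Claim: The adjoint of the adjoint is the original operator: $(S^\dagger)^\dagger = S$. We can show this as follows: 
$\langle u, S^\dagger v \rangle = \langle (S^\dagger)^\dagger u, v \rangle$. Now, additionally $\langle u, S^\dagger v \rangle = \langle S^\dagger v, u \rangle^* = \langle v, Su \rangle^* = \langle Su, v \rangle$. Comparing with 
the first result, we have shown that $(S^\dagger)^\dagger u = Su$, for any $u$, which proves the claim. 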

Example: Let $v = (v_1, v_2, v_3)$, with $v_i \in \mathbb{C}$, denote a vector in the three-dimensional complex vector 
space $\mathbb{C}^3$. Define a linear operator $T$ that acts on $v$ as follows: 

$T(v_1, v_2, v_3) = (0 v_1 + 2 v_2 + i v_3 ,\; v_1 - i v_2 + 0 v_3 ,\; 3i v_1 + v_2 + 7 v_3)$ . (6.15) 

Calculate the action of $T^\dagger$ on a vector. Give the matrix representations of $T$ and $T^\dagger$ using the 
orthonormal basis $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, $e_3 = (0, 0, 1)$. Assume the inner product is the standard 
one on $\mathbb{C}^3$. 

Solution: We introduce a vector $u = (u_1, u_2, u_3)$ and will use the basic identity $\langle u, Tv \rangle = \langle T^\dagger u, v \rangle$. 
The left-hand side of the identity gives: 

$\langle u, Tv \rangle = u_1^* (2 v_2 + i v_3) + u_2^* (v_1 - i v_2) + u_3^* (3i v_1 + v_2 + 7 v_3)$ . (6.16) 

This is now rewritten by factoring the various $v_i$'s: 

$\langle u, Tv \rangle = (u_2^* + 3i u_3^*) v_1 + (2 u_1^* - i u_2^* + u_3^*) v_2 + (i u_1^* + 7 u_3^*) v_3$ . (6.17) 

Identifying the right-hand side with $\langle T^\dagger u, v \rangle$ we now deduce that 

$T^\dagger(u_1, u_2, u_3) = (u_2 - 3i u_3 ,\; 2 u_1 + i u_2 + u_3 ,\; -i u_1 + 7 u_3)$ . (6.18) 

This gives the action of $T^\dagger$. To find the matrix representation we begin with $T$. Using basis vectors, 
we have from (6.15) 

$T e_1 = T(1, 0, 0) = (0, 1, 3i) = e_2 + 3i e_3 = T_{11} e_1 + T_{21} e_2 + T_{31} e_3$ , (6.19) 

and deduce that $T_{11} = 0$, $T_{21} = 1$, $T_{31} = 3i$. This can be repeated, and the rule becomes clear quickly: 
the coefficients of $v_i$ read left to right fit into the $i$-th column of the matrix. Thus, we have 

$T = \begin{pmatrix} 0 & 2 & i \\ 1 & -i & 0 \\ 3i & 1 & 7 \end{pmatrix}$ and $T^\dagger = \begin{pmatrix} 0 & 1 & -3i \\ 2 & i & 1 \\ -i & 0 & 7 \end{pmatrix}$ . (6.20) 

These matrices are related: one is the transpose and complex conjugate of the other! This is not an 
accident. 

Let us reframe this using matrix notation. Let $u = e_i$ and $v = e_j$ where $e_i$ and $e_j$ are orthonormal 
basis vectors. Then the definition $\langle u, Tv \rangle = \langle T^\dagger u, v \rangle$ can be written as 

$\langle T^\dagger e_i, e_j \rangle = \langle e_i, T e_j \rangle$ 

$\langle (T^\dagger)_{ki} e_k, e_j \rangle = \langle e_i, T_{kj} e_k \rangle$ 

$((T^\dagger)_{ki})^* \delta_{kj} = T_{kj} \delta_{ik}$ (6.21) 

$((T^\dagger)_{ji})^* = T_{ij}$ . 

Relabeling $i$ and $j$ and taking the complex conjugate we find the familiar relation between a matrix 
and its adjoint: 

$(T^\dagger)_{ij} = (T_{ji})^*$ . (6.22) 

The adjoint matrix is the transpose and complex conjugate matrix only if we use an orthonormal basis. 
If we did not, in the equation above the use of $\langle e_i, e_j \rangle = \delta_{ij}$ would be replaced by $\langle e_i, e_j \rangle = g_{ij}$, where 
$g_{ij}$ is some constant matrix that would appear in the rule for the construction of the adjoint matrix. 
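The defining identity of the adjoint can be checked numerically for the operator of (6.15). The sketch below assumes the standard orthonormal basis of $\mathbb{C}^3$, so that the adjoint matrix is the conjugate transpose as in (6.22); the helper names `apply` and `dagger` and the test vectors are illustrative.

```python
def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def apply(M, v):
    """Matrix M (nested lists) acting on the column vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def dagger(M):
    """Conjugate transpose, eq. (6.22): (M^dagger)_ij = (M_ji)^*."""
    n = len(M)
    return [[complex(M[c][r]).conjugate() for c in range(n)] for r in range(n)]

# Matrix of the operator T defined in (6.15)
T = [[0, 2, 1j],
     [1, -1j, 0],
     [3j, 1, 7]]
T_dag = dagger(T)

# Arbitrary test vectors (assumed, not from the notes)
u = [1 + 1j, 2 - 3j, -1j]
v = [2j, 1 - 1j, 4 + 0j]
lhs = inner(u, apply(T, v))        # <u, T v>
rhs = inner(apply(T_dag, u), v)    # <T^dagger u, v>
print(T_dag)
print(lhs, rhs)
```

The printed `T_dag` should match the second matrix in (6.20), and the two inner products should coincide.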
7 Hermitian and Unitary operators 

Before we begin looking at special kinds of operators let us consider a very surprising fact about 
operators on complex vector spaces, as opposed to operators on real vector spaces. 

Suppose we have an operator $T$ that is such that for any vector $v \in V$ the following inner product 
vanishes 

$\langle v, Tv \rangle = 0$ for all $v \in V$. (7.23) 

What can we say about the operator $T$? The condition states that $T$ is an operator that starting from 
a vector gives a vector orthogonal to the original one. In a two-dimensional real vector space, an 
example is the operator that rotates any vector by ninety degrees! It is quite surprising and important 
that for complex vector spaces the result is very strong: any such operator $T$ necessarily vanishes. 
This is a theorem: 

Theorem: Let $T$ be a linear operator in a complex vector space $V$: 

If $\langle v, Tv \rangle = 0$ for all $v \in V$, then $T = 0$. (7.24) 

Proof: Any proof must be such that it fails to work for real vector spaces. Note that the result 
follows if we could prove that $\langle u, Tv \rangle = 0$ for all $u, v \in V$. Indeed, if this holds, then take $u = Tv$; 
then $\langle Tv, Tv \rangle = 0$ for all $v$ implies that $Tv = 0$ for all $v$ and therefore $T = 0$. 

We will thus try to show that $\langle u, Tv \rangle = 0$ for all $u, v \in V$. All we know is that objects of the form 
$\langle x, Tx \rangle$ vanish, whatever $x$ is. So we must aim to form linear combinations of such terms in order 
to reproduce $\langle u, Tv \rangle$. We begin by trying the following 

$\langle u + v, T(u + v) \rangle - \langle u - v, T(u - v) \rangle = 2 \langle u, Tv \rangle + 2 \langle v, Tu \rangle$ . (7.25) 

We see that the “diagonal” terms dropped out, but instead of getting just $\langle u, Tv \rangle$ we also got $\langle v, Tu \rangle$. 
Here is where complex numbers help: we can get the same two terms but with opposite signs by trying 

$\langle u + iv, T(u + iv) \rangle - \langle u - iv, T(u - iv) \rangle = 2i \langle u, Tv \rangle - 2i \langle v, Tu \rangle$ . (7.26) 

It follows from the last two relations that 

$\langle u, Tv \rangle = \frac{1}{4} \Big( \langle u + v, T(u + v) \rangle - \langle u - v, T(u - v) \rangle + \frac{1}{i} \langle u + iv, T(u + iv) \rangle - \frac{1}{i} \langle u - iv, T(u - iv) \rangle \Big)$ . (7.27) 

The condition $\langle v, Tv \rangle = 0$ for all $v$ implies that each term of the above right-hand side vanishes, thus 
showing that $\langle u, Tv \rangle = 0$ for all $u, v \in V$. As explained above this proves the result. 
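The identity (7.27) holds for any operator on a complex vector space, so it can be tested on an arbitrary matrix. The sketch below assumes the standard inner product on $\mathbb{C}^2$; the helper names `q` and `polarization`, the matrix `T`, and the vectors are illustrative. It also shows why the theorem fails over the reals, using the ninety-degree rotation.

```python
def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def apply(M, v):
    """Matrix M (nested lists) acting on the column vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def q(M, x):
    """The 'diagonal' quantity <x, M x>."""
    return inner(x, apply(M, x))

def polarization(M, u, v):
    """Right-hand side of (7.27), built only from diagonal quantities."""
    def comb(s):
        return [ui + s * vi for ui, vi in zip(u, v)]
    return 0.25 * (q(M, comb(1)) - q(M, comb(-1))
                   + (q(M, comb(1j)) - q(M, comb(-1j))) / 1j)

T = [[1 + 2j, -1j], [3, 2 - 1j]]   # arbitrary complex matrix (assumed example)
u = [1 - 1j, 2j]
v = [0.5 + 1j, -2 + 0j]
print(polarization(T, u, v), inner(u, apply(T, v)))   # should agree

# Ninety-degree rotation on R^2: <v, R v> = 0 for every real v, yet R != 0.
R = [[0, -1], [1, 0]]
real_case = q(R, [3.0, -2.0])      # vanishes for a real vector
complex_case = q(R, [1, 1j])       # does not vanish for a complex vector
print(real_case, complex_case)
```

The last line illustrates the point of the proof: over $\mathbb{C}$ the rotation matrix no longer satisfies the hypothesis (7.23), because vectors with complex components are allowed.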

An operator $T$ is said to be Hermitian if $T^\dagger = T$. Hermitian operators are pervasive in quantum 
mechanics. The above theorem in fact helps us discover Hermitian operators. It is familiar that the 
expectation value of a Hermitian operator, on any state, is real. It is also true, however, that any 
operator whose expectation value is real for all states must be Hermitian: 

$T = T^\dagger$ if and only if $\langle v, Tv \rangle \in \mathbb{R}$ for all $v$. (7.28) 

To prove this first go from left to right. If $T = T^\dagger$, 

$\langle v, Tv \rangle = \langle T^\dagger v, v \rangle = \langle Tv, v \rangle = \langle v, Tv \rangle^*$ , (7.29) 

showing that $\langle v, Tv \rangle$ is real. To go from right to left first note that the reality condition means that 

$\langle v, Tv \rangle = \langle Tv, v \rangle = \langle v, T^\dagger v \rangle$ , (7.30) 

where the last equality follows because $(T^\dagger)^\dagger = T$. Now the leftmost and rightmost terms can be 
combined to give $\langle v, (T - T^\dagger) v \rangle = 0$, which holding for all $v$ implies, by the theorem, that $T = T^\dagger$. 

We can prove two additional results of Hermitian operators rather easily. We have discussed earlier 
the fact that on a complex vector space any linear operator has at least one eigenvalue. Here we learn 
that the eigenvalues of a Hermitian operator are real numbers. Moreover, while we have noted that 
eigenvectors corresponding to different eigenvalues are linearly independent, for Hermitian operators 
they are guaranteed to be orthogonal. Thus we have the following theorems: 

Theorem 1: The eigenvalues of Hermitian operators are real. 

Theorem 2: Different eigenvalues of a Hermitian operator correspond to orthogonal eigenvectors. 

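Proof 1: Let $v$ be a nonzero eigenvector of the Hermitian operator $T$ with eigenvalue $\lambda$: $Tv = \lambda v$. 
Taking the inner product with $v$ we have that 

$\langle v, Tv \rangle = \langle v, \lambda v \rangle = \lambda \langle v, v \rangle$ . (7.31) 

Since $T$ is Hermitian, we can also evaluate $\langle v, Tv \rangle$ as follows 

$\langle v, Tv \rangle = \langle Tv, v \rangle = \langle \lambda v, v \rangle = \lambda^* \langle v, v \rangle$ . (7.32) 

The above equations give $(\lambda - \lambda^*) \langle v, v \rangle = 0$ and since $v$ is not the zero vector, we conclude that $\lambda^* = \lambda$, 
showing the reality of $\lambda$. 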

Proof 2: Let $v_1$ and $v_2$ be eigenvectors of the operator $T$: 

$T v_1 = \lambda_1 v_1$ , $T v_2 = \lambda_2 v_2$ , (7.33) 

with $\lambda_1$ and $\lambda_2$ real (previous theorem) and different from each other. Consider the inner product 
$\langle v_2, T v_1 \rangle$ and evaluate it in two different ways. First 

$\langle v_2, T v_1 \rangle = \langle v_2, \lambda_1 v_1 \rangle = \lambda_1 \langle v_2, v_1 \rangle$ , (7.34) 

and second, using hermiticity of $T$ and the reality of $\lambda_2$, 

$\langle v_2, T v_1 \rangle = \langle T v_2, v_1 \rangle = \langle \lambda_2 v_2, v_1 \rangle = \lambda_2 \langle v_2, v_1 \rangle$ . (7.35) 

From these two evaluations we conclude that 

$(\lambda_1 - \lambda_2) \langle v_2, v_1 \rangle = 0$ (7.36) 

and the assumption $\lambda_1 \neq \lambda_2$ leads to $\langle v_2, v_1 \rangle = 0$, and hence $\langle v_1, v_2 \rangle = 0$, showing the orthogonality of the eigenvectors. 
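Both theorems can be illustrated on a $2 \times 2$ Hermitian matrix, where the eigenvalues follow from the quadratic characteristic polynomial. The sketch below is only an example under stated assumptions: the matrix entries `a`, `d`, `b` are hypothetical (with `a`, `d` real and `b` nonzero), and the eigenvector formula $(b, \lambda - a)$ is specific to this $2 \times 2$ form.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def apply(M, v):
    """Matrix M (nested lists) acting on the column vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

# Hypothetical Hermitian matrix [[a, b], [conj(b), d]] with a, d real and b != 0.
a, d, b = 2.0, -1.0, 1 - 2j
H = [[a, b], [b.conjugate(), d]]

# Roots of lambda^2 - (a + d) lambda + (a d - |b|^2) = 0.
# The square root has a non-negative argument, so both roots are real.
mean = (a + d) / 2
half_gap = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
lam1, lam2 = mean + half_gap, mean - half_gap

# For b != 0, the vector (b, lam - a) satisfies H v = lam v.
v1 = [b, lam1 - a]
v2 = [b, lam2 - a]

residual1 = [x - lam1 * y for x, y in zip(apply(H, v1), v1)]   # H v1 - lam1 v1
residual2 = [x - lam2 * y for x, y in zip(apply(H, v2), v2)]   # H v2 - lam2 v2
overlap = inner(v1, v2)                                        # should vanish

print(lam1, lam2)
print(residual1, residual2, overlap)
```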

Let us now consider another important class of linear operators on a complex vector space, the so-called 
unitary operators. An operator $U \in \mathcal{L}(V)$ in a complex vector space $V$ is said to be a unitary 
operator if it is surjective and does not change the magnitude of the vector it acts upon: 

$|Uu| = |u|$ , for all $u \in V$. (7.37) 

We tailored the definition to be useful even for infinite dimensional spaces. Note that $U$ can only kill 
vectors of zero length, and since the only such vector is the zero vector, $\mathrm{null}\, U = 0$, and $U$ is injective. 
Since $U$ is also assumed to be surjective, a unitary operator $U$ is always invertible. 

A simple example of a unitary operator is the operator $\lambda I$ with $\lambda$ a complex number of unit norm: 
$|\lambda| = 1$. Indeed $|\lambda I u| = |\lambda u| = |\lambda|\,|u| = |u|$ for all $u$. Moreover, the operator is clearly surjective. 

For another useful characterization of unitary operators we begin by squaring (7.37): 

$\langle Uu, Uu \rangle = \langle u, u \rangle$ . (7.38) 

By the definition of adjoint 

$\langle u, U^\dagger U u \rangle = \langle u, u \rangle \quad \to \quad \langle u, (U^\dagger U - I) u \rangle = 0$ for all $u$. (7.39) 

So by our theorem $U^\dagger U = I$, and since $U$ is invertible this means $U^\dagger$ is the inverse of $U$ and we also 
have $U U^\dagger = I$: 

$U^\dagger U = U U^\dagger = I$ . (7.40) 

Unitary operators preserve inner products in the following sense 

$\langle Uu, Uv \rangle = \langle u, v \rangle$ . (7.41) 

This follows immediately by moving the first $U$ to act on the second input as $U^\dagger$ and using $U^\dagger U = I$. 

Assume the vector space $V$ is finite dimensional and has an orthonormal basis $(e_1, \ldots, e_n)$. Consider 
the new set of vectors $(f_1, \ldots, f_n)$ where the $f$'s are obtained from the $e$'s by the action of a unitary 
operator $U$: 

$f_i = U e_i$ . (7.42) 

This also means that $e_i = U^\dagger f_i$. We readily see that the $f$'s are also a basis, because they are linearly 
independent: acting on $a_1 f_1 + \ldots + a_n f_n = 0$ with $U^\dagger$ we find $a_1 e_1 + \ldots + a_n e_n = 0$, and thus $a_i = 0$. 
We now see that the new basis is also orthonormal: 

$\langle f_i, f_j \rangle = \langle U e_i, U e_j \rangle = \langle e_i, e_j \rangle = \delta_{ij}$ . (7.43) 

The matrix elements of $U$ in the $e$-basis are 

$U_{ki} = \langle e_k, U e_i \rangle$ . (7.44) 

Let us compute the matrix elements $U'_{ki}$ of $U$ in the $f$-basis: 

$U'_{ki} = \langle f_k, U f_i \rangle = \langle U e_k, U f_i \rangle = \langle e_k, f_i \rangle = \langle e_k, U e_i \rangle = U_{ki}$ . (7.45) 

The matrix elements are the same! Can you find an explanation for this result? 
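The statements (7.41) and (7.45) can be confirmed on a small example. The sketch below assumes $V = \mathbb{C}^2$ with the standard inner product; the particular unitary matrix `U` and the test vectors are illustrative choices, and `U_e`, `U_f` are hypothetical names for the matrices of $U$ in the two bases.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, conjugating the first entry."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def apply(M, v):
    """Matrix M (nested lists) acting on the column vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

s = 1 / math.sqrt(2)
U = [[s, s], [s * 1j, -s * 1j]]   # columns are orthonormal, so U is unitary

e = [[1, 0], [0, 1]]              # orthonormal e-basis
f = [apply(U, ei) for ei in e]    # f_i = U e_i, eq. (7.42)

# Matrix elements of U in the e-basis (7.44) and in the f-basis (7.45)
U_e = [[inner(e[k], apply(U, e[i])) for i in range(2)] for k in range(2)]
U_f = [[inner(f[k], apply(U, f[i])) for i in range(2)] for k in range(2)]

# Arbitrary test vectors (assumed, not from the notes)
u, v = [1 + 2j, -1j], [3 + 0j, 2 - 1j]
print(inner(apply(U, u), apply(U, v)), inner(u, v))   # eq. (7.41): should agree
print(U_e)
print(U_f)
```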